Reinforcement Learning of Visual Features
Abstract
The digital environment contains an ever-increasing number of smart programs, and these programs get smarter every day. They help us filter spam e-mail and adapt to show us personalized advertisements. These smart programs observe people and serve (other) people. A robot can be seen as a program with a body: make the program smart enough and it can help us in the real world too. The smartest programs learn from observations to become better at what they do. Reinforcement Learning (RL) is a type of learning that has been successfully applied to a variety of learning tasks. RL is learning from experience, in the form of sensory changes and rewards. A robot that uses RL tries to choose its actions so as to maximize the reward it receives. Most RL algorithms do not scale well to large sensory inputs, and images are very large inputs because each pixel is an input. Algorithms have therefore been created to compress visual information into abstract representations (visual features). Neural Q-learning [12] is such a method. It combines the RL algorithm Q-learning with Artificial Neural Networks (ANNs). ANNs are networks of neurons that each perform a small adjustable calculation. The network can transform its input into more abstract or more useful information, and it learns by adjusting the calculations until the network produces the desired transformation. An ANN is therefore a good way to find complex transformations that make visual data more abstract and more compressed.
In this thesis, Deep Q-learning is tested on a more difficult task combined with a higher world complexity. In the original paper it was tested on ATARI 2600 games and achieved good results. Here it is tested on a transportation task in a 3D simulation in which the learner only receives a relatively large first-person-perspective image from the robot it controls.
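The Q-learning update that Neural Q-learning builds on can be sketched in a few lines. The example below is only an illustration and not the thesis's method: it swaps the neural network for a plain lookup table (tabular Q-learning), and the corridor environment, the function name `train_corridor`, and all parameter values (`alpha`, `gamma`, `epsilon`) are assumptions made for this sketch. The single reward arrives only at the end of the corridor, which gives a small-scale picture of a delayed reward.

```python
import random


def train_corridor(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
                   epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor of n_states cells.

    The agent starts in cell 0 and receives a reward of 1.0 only when it
    reaches the last cell, which ends the episode.
    """
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0 steps left, action 1 steps right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            s_next = min(max(s + moves[a], 0), goal)
            done = s_next == goal
            r = 1.0 if done else 0.0

            # Q-learning target: reward plus discounted best next value.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(train_corridor()):
        print(state, round(left, 3), round(right, 3))
```

In this toy setting the value of stepping right shrinks by a factor of `gamma` for every cell further from the goal, which is one way to see why a reward that arrives after many steps produces only a weak learning signal at the start of an episode.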
The results show that the complexity of the visual information and the relatively long-delayed reinforcements lead to a ratio of initialization noise to reinforcement signal at which the learner was unable to make the neural network converge to beneficial behavior. What was learned was forgotten faster than the learner could replay the useful experiences. It can be concluded that simply scaling up the environment complexity is not possible with the Neural Q-learning algorithm as it stands. The learning algorithm needs an extension that makes it better able to handle long-delayed rewards with large visual inputs.
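Replaying experiences is usually done with a buffer of stored transitions. The abstract does not describe the replay mechanism used in the thesis, so the following is a generic sketch of a fixed-size buffer with uniform sampling; the class name `ReplayBuffer`, its methods, and the capacity are assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen silently drops the oldest item when full.
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self._items.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw without replacement; never more than what is stored.
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def rewarding_fraction(self):
        """Share of stored transitions that carry a non-zero reward."""
        if not self._items:
            return 0.0
        return sum(1 for t in self._items if t[2] != 0) / len(self._items)

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=100)
    buffer.add("s0", 1, 1.0, "s1", True)  # one rewarding transition
    for i in range(99):
        buffer.add(f"s{i}", 0, 0.0, f"s{i + 1}", False)
    print(buffer.rewarding_fraction())
```

With uniform sampling, a rare rewarding transition is replayed only as often as its share of the buffer, and it is evicted once enough newer transitions arrive. This illustrates in a simplified way how sparse, delayed rewards can be under-represented during replay.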
Related resources
RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
The principal aim of a search engine is to provide sorted results according to the user's requirements. To achieve this aim, it employs ranking methods to rank web documents based on their significance and relevance to the user's query. The novelty of this paper is a user-feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...
Closed-Loop Learning of Visual Control Policies
In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforcement learning algorithms that are applicable to visual spaces. Inspired by the paradigm of local-ap...
Learning Visual Servoing with Deep Features and Fitted Q-Iteration
Visual servoing involves choosing actions that move a robot in response to observations from a camera, in order to reach a goal configuration in the world. Standard visual servoing approaches typically rely on manually designed features and analytical dynamics models, which limits their generalization capability and often requires extensive application-specific feature and model engineering. In...
Reinforcement Learning for a Visually-guided Autonomous Underwater Vehicle
Reinforcement learning uses a scalar reward signal and much interaction with the environment to form a policy of correct behavior. We have applied this technique to the problem of developing a controller for an autonomous underwater vehicle and have achieved reliable off-line development of stable controllers. Many important underwater tasks rely upon visual observation of underwater feature...
Web pages ranking algorithm based on reinforcement learning and user feedback
The main challenge of a search engine is ranking web documents to provide the best response to a user's query. Despite the huge number of results extracted for a user's query, only a small number of the first results are examined by users; therefore, placing the related results in the first ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement le...
Learning to Save Lives: Using Reinforcement Learning with Environment Features for Efficient Robot Search and Rescue
This project proposes a reinforcement learning approach to robot search and rescue (SAR). The approach uses Q-Learning with a neural network representation to learn collision avoidance, exploration and victim discovery behaviours. Visual features in the environment are used to learn correlations between detected features and the likelihood of victim discovery. The system was built using the ROS...
Publication date: 2016